Measure distance, and explore neighboring points on a map.
Introduction
In this tutorial, you’ll explore several techniques for proximity analysis. In particular, you’ll learn how to do such things as:
measure the distance between points on a map, and
select all points within some radius of a feature.
import foliumfrom folium import Marker, GeoJsonfrom folium.plugins import HeatMapimport pandas as pdimport geopandas as gpd
C:\Users\heeyoung\.conda\envs\quarto\lib\site-packages\geopandas\_compat.py:123: UserWarning: The Shapely GEOS version (3.10.1-CAPI-1.16.0) is incompatible with the GEOS version PyGEOS was compiled with (3.10.4-CAPI-1.16.2). Conversions between both will be slow.
warnings.warn(
You’ll work with a dataset from the US Environmental Protection Agency (EPA) that tracks releases of toxic chemicals in Philadelphia, Pennsylvania, USA.
To measure distances between points from two different GeoDataFrames, we first have to make sure that they use the same coordinate reference system (CRS). Thankfully, this is the case here, where both use EPSG 2272.
print(stations.crs)print(releases.crs)
epsg:2272
epsg:2272
We also check the CRS to see which units it uses (meters, feet, or something else). In this case, EPSG 2272 has units of feet. (If you like, you can check this here.)
It’s relatively straightforward to compute distances in GeoPandas. The code cell below calculates the distance (in feet) between a relatively recent release incident in recent_release and every station in the stations GeoDataFrame.
# Select one release incident in particularrecent_release = releases.iloc[360]# Measure distance from release to each stationdistances = stations.geometry.distance(recent_release.geometry)distances
Using the calculated distances, we can obtain statistics like the mean distance to each station.
print('Mean distance to monitoring stations: {} feet'.format(distances.mean()))
Mean distance to monitoring stations: 33516.28487007786 feet
Or, we can get the closest monitoring station.
print('Closest monitoring station ({} feet):'.format(distances.min()))print(stations.iloc[distances.idxmin()][["ADDRESS", "LATITUDE", "LONGITUDE"]])
Closest monitoring station (3780.623590556444 feet):
ADDRESS 3100 Penrose Ferry Road
LATITUDE 39.91279
LONGITUDE -75.185448
Name: 9, dtype: object
Creating a buffer
If we want to understand all points on a map that are some radius away from a point, the simplest way is to create a buffer.
The code cell below creates a GeoSeries two_mile_buffer containing 12 different Polygon objects. Each polygon is a buffer of 2 miles (or, 2*5280 feet) around a different air monitoring station.
We use folium.GeoJson() to plot each polygon on a map.
Note that since folium requires coordinates in latitude and longitude, we have to convert the CRS to EPSG 4326 before plotting.
# Create map with release incidents and monitoring stationsm = folium.Map(location=[39.9526,-75.1652], zoom_start=11)HeatMap(data=releases[['LATITUDE', 'LONGITUDE']], radius=15).add_to(m)for idx, row in stations.iterrows(): Marker([row['LATITUDE'], row['LONGITUDE']]).add_to(m)# Plot each polygon on the mapGeoJson(two_mile_buffer.to_crs(epsg=4326)).add_to(m)# Show the mapm
Make this Notebook Trusted to load map: File -> Trust Notebook
Now, to test if a toxic release occurred within 2 miles of any monitoring station, we could run 12 different tests for each polygon (to check individually if it contains the point).
But a more efficient way is to first collapse all of the polygons into a MultiPolygon object. We do this with the unary_union attribute.
# Turn group of polygons into single multipolygonmy_union = two_mile_buffer.geometry.unary_unionprint('Type:', type(my_union))# Show the MultiPolygon objectmy_union
We use the contains() method to check if the multipolygon contains a point. We’ll use the release incident from earlier in the tutorial, which we know is roughly 3781 feet to the closest monitoring station.
# The closest station is less than two miles awaymy_union.contains(releases.iloc[360].geometry)
True
But not all releases occured within two miles of an air monitoring station!
# The closest station is more than two miles awaymy_union.contains(releases.iloc[358].geometry)
False
Your turn
In the final exercise, you’ll investigate hospital coverage in New York City.
Exercise
introduction
위기 대응 팀의 일원인 여러분은 뉴욕시에서 발생한 충돌 사고에 대해 병원들이 어떻게 대응하고 있는지 파악하고자 합니다.
시작하기 전에 아래 코드 셀을 실행하여 모든 것을 설정하세요.
import mathimport geopandas as gpdimport pandas as pdfrom shapely.geometry import MultiPolygonimport foliumfrom folium import Choropleth, Markerfrom folium.plugins import HeatMap, MarkerCluster
1) Visualize the collision data.
아래 코드 셀을 실행하여 2013~2018년 주요 자동차 충돌을 추적하는 GeoDataFrame을 로드합니다.
“위도” 및 “경도” 열을 사용하여 대화형 맵을 만들어 충돌 데이터를 시각화합니다. 어떤 유형의 맵이 가장 효과적이라고 생각하시나요?
m_1 = folium.Map(location=[40.7, -74], zoom_start=11) # Your code here: Visualize the collision dataHeatMap(data=collisions[['LATITUDE', 'LONGITUDE']], radius=9).add_to(m_1)# show the mapm_1
Make this Notebook Trusted to load map: File -> Trust Notebook
m_2 = folium.Map(location=[40.7, -74], zoom_start=11) # Your code here: Visualize the hospital locationsfor idx, row in hospitals.iterrows(): Marker([row['latitude'], row['longitude']], popup=row['name']).add_to(m_2)# show the mapm_2
Make this Notebook Trusted to load map: File -> Trust Notebook
3) When was the closest hospital more than 10 kilometers away?
가장 가까운 병원에서 10km 이상 떨어진 곳에서 발생한 충돌의 모든 행을 포함하는 외부_범위 데이터 프레임을 만듭니다.
병원과 충돌 모두 좌표 참조 시스템으로 EPSG 2263을 사용하며, EPSG 2263에는 미터 단위가 있습니다.
다음 코드 셀은 가장 가까운 병원에서 10km 이상 떨어진 곳에서 발생한 충돌의 비율을 계산합니다.
percentage =round(100*len(outside_range)/len(collisions), 2)print("Percentage of collisions more than 10 km away from the closest hospital: {}%".format(percentage))
Percentage of collisions more than 10 km away from the closest hospital: 15.12%
4) Make a recommender.
먼 곳에서 충돌 사고가 발생하면 부상자를 가장 가까운 병원으로 이송하는 것이 더욱 중요해집니다.
이를 염두에 두고 충돌 위치(EPSG 2263)를 입력으로 사용하는
충돌 위치(EPSG 2263)를 입력으로 받습니다,
가장 가까운 병원을 찾고(거리 계산은 EPSG 2263에서 수행됨), 그리고
가장 가까운 병원의 이름을 반환합니다.
def best_hospital(collision_location): idx_min = hospitals.geometry.distance(collision_location).idxmin() my_hospital = hospitals.iloc[idx_min] name = my_hospital["name"]return name# Test your function: this should suggest CALVARY HOSPITAL INCprint(best_hospital(outside_range.geometry.iloc[0]))
CALVARY HOSPITAL INC
5) Which hospital is under the highest demand?
외부_범위 데이터 프레임의 충돌만 고려할 때 가장 권장되는 병원은 어디일까요?
답은 4)에서 생성한 함수가 반환한 병원 이름과 정확히 일치하는 Python 문자열이어야 합니다.
Make this Notebook Trusted to load map: File -> Trust Notebook
지도의 아무 곳이나 클릭하면 해당 위치가 위도와 경도로 표시된 팝업이 표시됩니다.
뉴욕시에서 새로운 병원 두 곳의 위치를 결정하는 데 도움을 요청하는 연락이 왔습니다. 특히 3단계)에서 계산된 비율을 10% 미만으로 낮추기 위해 위치를 식별하는 데 도움을 요청합니다. 지도를 사용하여(그리고 구역 법이나 병원을 짓기 위해 어떤 건물을 철거해야 하는지에 대해 걱정하지 않고) 이 목표를 달성하는 데 도움이 될 두 위치를 식별할 수 있나요?
병원 1의 제안된 위도와 경도를 각각 lat_1과 long_1에 넣습니다. (병원 2도 마찬가지입니다.)
그런 다음 나머지 셀을 그대로 실행하여 새 병원의 효과를 확인합니다. 두 개의 새로운 병원으로 인해 비율이 10% 미만이 되면 정답으로 표시됩니다.
# Your answer here: proposed location of hospital 1lat_1 =40.6714long_1 =-73.8492# Your answer here: proposed location of hospital 2lat_2 =40.6702long_2 =-73.7612# Do not modify the code below this linetry: new_df = pd.DataFrame( {'Latitude': [lat_1, lat_2],'Longitude': [long_1, long_2]}) new_gdf = gpd.GeoDataFrame(new_df, geometry=gpd.points_from_xy(new_df.Longitude, new_df.Latitude)) new_gdf.crs = {'init' :'epsg:4326'} new_gdf = new_gdf.to_crs(epsg=2263)# get new percentage new_coverage = gpd.GeoDataFrame(geometry=new_gdf.geometry).buffer(10000) new_my_union = new_coverage.geometry.unary_union new_outside_range = outside_range.loc[~outside_range["geometry"].apply(lambda x: new_my_union.contains(x))] new_percentage =round(100*len(new_outside_range)/len(collisions), 2)print("(NEW) Percentage of collisions more than 10 km away from the closest hospital: {}%".format(new_percentage))# Did you help the city to meet its goal? q_6.check()# make the map m = folium.Map(location=[40.7, -74], zoom_start=11) folium.GeoJson(coverage.geometry.to_crs(epsg=4326)).add_to(m) folium.GeoJson(new_coverage.geometry.to_crs(epsg=4326)).add_to(m)for idx, row in new_gdf.iterrows(): Marker([row['Latitude'], row['Longitude']]).add_to(m) HeatMap(data=new_outside_range[['LATITUDE', 'LONGITUDE']], radius=9).add_to(m) folium.LatLngPopup().add_to(m) display(m)except: m
C:\Users\heeyoung\.conda\envs\quarto\lib\site-packages\pyproj\crs\crs.py:141: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
in_crs_string = _prepare_from_proj_string(in_crs_string)
(NEW) Percentage of collisions more than 10 km away from the closest hospital: 9.12%